XML transformation language

An XML transformation language is a programming language designed specifically to transform an input XML document into an output XML document which satisfies some specific goal.

There are two special cases of transformation: XML to XML and XML to Data.

XML to XML

Because an XML to XML transformation outputs an XML document, such transformations can be chained, and the resulting chains form XML pipelines.
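
The chaining idea can be sketched in Python with the standard-library xml.etree.ElementTree module. This is only an illustration, not an implementation of any pipeline language: the stage functions drop_drafts and rename_items and the item/entry vocabulary are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET
from functools import reduce

def drop_drafts(root):
    """Stage 1 (hypothetical): remove <item> children marked status="draft"."""
    for item in list(root.findall("item")):
        if item.get("status") == "draft":
            root.remove(item)
    return root

def rename_items(root):
    """Stage 2 (hypothetical): rename every remaining <item> to <entry>."""
    for item in list(root.iter("item")):
        item.tag = "entry"
    return root

def run_pipeline(xml_text, stages):
    """Parse the input, feed each stage's XML output to the next stage, serialize."""
    root = reduce(lambda tree, stage: stage(tree), stages, ET.fromstring(xml_text))
    return ET.tostring(root, encoding="unicode")

source = '<list><item status="draft">a</item><item status="final">b</item></list>'
result = run_pipeline(source, [drop_drafts, rename_items])
print(result)
```

Each stage takes an XML tree and returns an XML tree, which is what allows the stages to be composed in any order.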

XML to Data

XML to Data transformation covers some important cases. The most notable one is XML to HTML, because an HTML document is not an XML document.
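
A minimal Python sketch of an XML to HTML transformation, assuming a hypothetical catalog/book/title input vocabulary, shows where the two output formats diverge. It uses the standard-library xml.etree.ElementTree module, whose serializer accepts method="html"; the function name catalog_to_html is invented for the example.

```python
import xml.etree.ElementTree as ET

def catalog_to_html(xml_text):
    """Build an HTML tree listing every <title> found under <book> (assumed input shape)."""
    catalog = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ul = ET.SubElement(body, "ul")
    for book in catalog.findall("book"):
        li = ET.SubElement(ul, "li")
        li.text = book.findtext("title", default="")
        ET.SubElement(li, "br")  # a void element in HTML
    return html

tree = catalog_to_html("<catalog><book><title>Dune</title></book></catalog>")
as_html = ET.tostring(tree, encoding="unicode", method="html")
as_xml = ET.tostring(tree, encoding="unicode", method="xml")
print(as_html)
print(as_xml)
```

The HTML serialization writes the line break as an unclosed tag, which is not well-formed XML, while the XML serialization writes it as a self-closing element.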

Existing languages

XSLT
XSLT is the best-known XML transformation language. The XSLT 1.0 W3C Recommendation was published in 1999 together with XPath 1.0, and it has been widely implemented since then. XSLT 2.0 became a W3C Recommendation in January 2007, and implementations of the specification, such as Saxon 8, are available.
XQuery
XQuery is a full functional language, despite having "query" in its name. It is a de facto standard used by Microsoft, Oracle, DB2, Mark Logic, etc., is the foundation for the XRX web programming model, and has a W3C Recommendation for version 1.0. Unlike XSLT, XQuery is not itself written in XML, so its syntax is much lighter. The language is based on XPath 2.0. Like XSLT programs, XQuery programs cannot have side effects, and the two languages provide almost the same capabilities (for instance, declaring variables and functions, iterating over sequences, and using W3C schema types), even though their syntax is quite different. XQuery is logic-driven, using FOR, WHERE and function composition (e.g. fn:concat("<html>", generate-body(), "</html>")). In contrast, XSLT is data-driven (push processing model): certain conditions in the input document trigger the execution of templates, rather than the code executing in the order in which it is written.
XProc
XProc is an XML Pipeline language. The XProc 1.0 W3C Recommendation was published in May 2010.
STX
STX (Streaming Transformations for XML) is inspired by XSLT but has been designed to allow a one-pass transformation process that never prevents streaming. Implementations are available in Java (Joost) and Perl (XML::STX).
XML Script
XML Script is an imperative scripting language inspired by Perl that uses XML syntax. XML Script supports XPath and its proprietary DSLPath for selecting nodes from the input tree.
FXT
FXT is a functional XML transformation tool, implemented in Standard ML.
XDuce
XDuce is a typed language with a lightweight syntax compared to XSLT. It is written in ML.
CDuce
CDuce extends XDuce to a general-purpose functional programming language. See the CDuce homepage.
XStream
XStream is a simple functional transformation language for XML documents based on CAML. XML transformations written in XStream are evaluated in a streaming fashion: when possible, parts of the output are computed and produced while the input document is still being parsed. Some transformations can thus be applied to huge XML documents that would not even fit in memory. The XStream compiler is distributed under the terms of the CeCILL free software license.
Xtatic
Xtatic applies methods from XDuce to C#. See the Xtatic homepage.
HaXml
HaXml is a library and collection of tools for writing XML transformations in Haskell. A paper about HaXml was published in 1999, and the library is also covered in an IBM developerWorks article. See also the more recent HXML and the Haskell XML Toolbox (HXT), which is based on the ideas of HaXml and HXML but takes a more general approach to XML processing.
XMLambda
XMLambda (XMλ) is described in a 1999 paper by Erik Meijer and Mark Shields. No implementation is available. See the XMLambda home page.
FleXML
FleXML is an XML processing language first implemented by Kristofer Rose. Its approach is to add actions to an XML DTD specifying processing instructions for any subset of the DTD's rules.
Scala
Scala is a general-purpose functional and object-oriented language with specific support for XML transformation in the form of XML pattern matching, literals, and expressions, along with standard XML libraries.
LINQ to XML
LINQ to XML is a .NET 3.5 syntax and programming API available in C#, VB and some other .NET languages. LINQ is primarily designed as a query language, but it also supports XML transforms.
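
The contrast drawn in the XQuery entry between logic-driven and data-driven (push) processing can be approximated in Python. The sketch below is an analogy only; it is neither XQuery nor XSLT. The template decorator, the TEMPLATES table and the doc/title/para vocabulary are hypothetical names, and unlike XSLT the push variant simply skips elements that have no matching rule.

```python
import xml.etree.ElementTree as ET

DOC = "<doc><title>Intro</title><para>Hello</para><para>World</para></doc>"

# Logic-driven style: the code decides what to fetch, in the order it is written.
def pull_transform(root):
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for para in root.findall("para"):      # plays the role of FOR
        if para.text:                      # plays the role of WHERE
            ET.SubElement(body, "p").text = para.text
    return html

# Data-driven (push) style: rules are registered per element name, and the
# order of elements in the input document decides which rule runs when.
TEMPLATES = {}

def template(tag):
    def register(func):
        TEMPLATES[tag] = func
        return func
    return register

def apply_templates(node, out):
    for child in node:
        rule = TEMPLATES.get(child.tag)
        if rule is not None:               # simplification: unmatched nodes are skipped
            rule(child, out)

@template("doc")
def doc_rule(node, out):
    apply_templates(node, ET.SubElement(out, "body"))

@template("title")
def title_rule(node, out):
    ET.SubElement(out, "h1").text = node.text

@template("para")
def para_rule(node, out):
    ET.SubElement(out, "p").text = node.text

def push_transform(root):
    html = ET.Element("html")
    TEMPLATES[root.tag](root, html)
    return html

root = ET.fromstring(DOC)
pull_html = ET.tostring(pull_transform(root), encoding="unicode")
push_html = ET.tostring(push_transform(root), encoding="unicode")
print(pull_html)
print(push_html)
```

In the first function the output contains only what the code explicitly asks for. In the second, the title appears in the output because the input contained a title element and a rule happened to match it.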
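
The one-pass, streaming evaluation described in the STX and XStream entries can be loosely imitated in Python with xml.etree.ElementTree.iterparse. This is a rough sketch under stated assumptions, not how either language is implemented: the people/person/name vocabulary and the function stream_uppercase_names are hypothetical, and the cleared elements remain attached to the root as empty shells, so memory use is reduced rather than strictly bounded.

```python
import io
import xml.etree.ElementTree as ET

def stream_uppercase_names(source, sink):
    """Write each <name> to sink, upper-cased, as soon as its end tag is parsed."""
    sink.write("<names>")
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "name":
            out = ET.Element("name")
            out.text = (elem.text or "").upper()
            # Output is produced while the rest of the input is still unread.
            sink.write(ET.tostring(out, encoding="unicode"))
        elif elem.tag == "person":
            # Drop the finished record's children and text to release memory.
            elem.clear()
    sink.write("</names>")

data = b"<people><person><name>ada</name></person><person><name>grace</name></person></people>"
sink = io.StringIO()
stream_uppercase_names(io.BytesIO(data), sink)
result = sink.getvalue()
print(result)
```

Because each output fragment depends only on the element that has just been closed, the transformation never needs to look back at earlier input.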

See also